Abstract The aim of this article is to briefly review, and extend with new studies, the
correlations and co-movements of stocks, so as to understand the "seasonalities" and
market evolution. Using intraday data for the CAC40, we begin by confirming
the findings of Allez and Bouchaud [1]: the average correlation between stocks
increases throughout the day. We then use multidimensional scaling (MDS) to generate
maps and visualize the dynamic evolution of the stock market during the day.
We do not find any marked difference in the structure of the market during a day.
Another aim is to use daily data for MDS studies, and to visualize or detect specific
sectors in a market and periods of crisis. We suggest that this type of visualization
may be used in identifying potential pairs of stocks for a "pairs trade".

1 Introduction

Many complex features of financial markets, including multi-fractal behaviour, have
been studied for a long time, and today constitute a collection of empirical "laws",
the so-called "stylized facts" [3]. The questions "How efficient is the market? To
what extent?" have long been debated by economists, econometricians and
practitioners of finance [3]. It is now accepted that the market is weakly efficient (at
least to some extent and on certain time scales), and that several quantities such as
price returns, volatility and traded volume do exhibit "seasonal patterns"¹; why
these "market anomalies" appear is, of course, not well understood. One reason
for their appearance could be that markets operate in synchronization with human
activities, so that the financial time series of returns of many assets reveal the
related statistical "seasonalities". Identifying such anomalies for the purpose of
statistical arbitrage is a usual practice. Another related practice is estimating market
co-movements, which is certainly relevant in several areas of finance, including
investment diversification [5] and risk management [6].

In this paper, we first present some notations, definitions and methods. We then
review existing results on intraday patterns concerning both individual and
collective stock dynamics. We compare the cross-sectional "dispersion" of returns and
its typical evolution during the day with the intraday pattern of the leading modes
of the cross-correlation matrix between stock returns, following the studies of Allez
and Bouchaud [1]. Then, we make additional plots of the pair-wise cross-correlation
matrix elements and study their typical evolution during the day. Finally, we use
multidimensional scaling (MDS) to generate maps and visualize the dynamic
evolution of the stock market during the day. When the MDS studies are repeated
with daily data, we find that it is easier to visualize or detect specific sectors and
market events. We suggest that this type of plot may be used in identifying
potential pairs of stocks for a "pairs trade".



2 Some data specifications, notations, and definitions 

In order to measure co-movements in the time series of stock prices, the popular
Pearson correlation coefficient is commonly used. However, it is now known that
several factors, viz. the statistical uncertainty associated with finite-size time
series, the heterogeneity of stocks, the heterogeneity of the average inter-transaction
times, and the asynchronicity of transactions, may affect the reliability of this estimator.
The investigation of high-frequency "tick-by-tick" data does enable one to monitor
market co-movements and price formation in real time. However, high-frequency
data have the drawback of aggravating the above-mentioned factors even further,
raising the need to adequately evaluate their impact through proper correlation
measures, such as the Hayashi-Yoshida estimator [7]. In this section, we introduce such
concepts, along with notations and definitions, and also specify the details of the
datasets used.

¹ "The existence of seasonal asset returns may be an indicator of market inefficiencies. . . The
presence of seasonal returns, however, does not necessitate market inefficiency" [4].

We have considered three data sets. 

• Daily returns: we have used the freely downloadable daily closure prices from
Yahoo for N = 54 companies on the New York Stock Exchange, over a period
spanning from January 1, 2008 to May 31, 2011.

• Intraday tick-by-tick: N = 40 companies of the CAC40 index for
March 2011, between 10:00 and 16:00 CET. We have purposefully avoided the
opening and closing hours of the market, so as to avoid certain anomalies.

• Intraday sampled returns: the same universe as the tick-by-tick data, but sampled in
bins of 5 minutes or 30 minutes. Thus, the total number of 5-minute bins is 72 per day
and the total number of 30-minute bins is 12 per day. The total number of trading
days in one month is around T = 21.
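
The sampling of tick data into fixed bins can be sketched as follows. This is only an illustrative Python sketch, not the procedure actually used to build the sampled dataset: the function names `bin_index` and `binned_log_returns`, the "last traded price" convention and the zero return assigned to empty bins are all assumptions made here.

```python
import math
from datetime import datetime, timedelta


def bin_index(ts, start, bin_minutes):
    """Zero-based index of the bin of width bin_minutes that contains ts."""
    return int((ts - start) // timedelta(minutes=bin_minutes))


def binned_log_returns(ticks, start, end, bin_minutes):
    """Log-return per bin from (timestamp, price) ticks sorted by time.

    Assumed convention: the return of a bin is log(last price in the bin /
    last price seen before the bin). Bins without a trade, or without a
    reference price, get a return of 0.0.
    """
    n_bins = int((end - start) // timedelta(minutes=bin_minutes))
    last_in_bin = [None] * n_bins
    prev = None  # last price seen before `start`
    for ts, price in ticks:
        if ts < start:
            prev = price
        elif ts < end:
            last_in_bin[bin_index(ts, start, bin_minutes)] = price
    returns = []
    for p in last_in_bin:
        if p is None or prev is None:
            returns.append(0.0)
        else:
            returns.append(math.log(p / prev))
        if p is not None:
            prev = p  # becomes the reference price of the next bin
    return returns
```

With a 10:00-16:00 session, this yields 72 bins of 5 minutes or 12 bins of 30 minutes, matching the counts quoted above.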



2.1 Cross-sectional "dispersion" of the binned data 

In this section we introduce the notations and definitions used by the authors of
Ref. [1] for their study of sampled intraday data; we will use the same notations
when reproducing their results for our own dataset.

Stocks are labelled by $i = 1, \dots, N$, days by $t = 1, \dots, T$ and bins by $k = 1, \dots, K$.
The return of stock $i$ in bin $k$ of day $t$ is denoted by $r_i(k;t)$. The temporal
distribution of stock $i$ in bin $k$ is characterised by its moments: the mean $\mu_i(k)$ and
the standard deviation (volatility) $\sigma_i(k)$, which are defined as:

$$\mu_i(k) = \langle r_i(k;t) \rangle \tag{1a}$$
$$\sigma_i^2(k) = \langle r_i(k;t)^2 \rangle - \mu_i^2(k), \tag{1b}$$

where averages over days for a given stock and a given bin are expressed with angled
brackets: $\langle \dots \rangle$.

The cross-sectional "dispersion" of the returns of the $N$ stocks for a given bin $k$
in a given day $t$ is likewise characterised by its moments:

$$\mu_d(k;t) = [r_i(k;t)] \tag{2a}$$
$$\sigma_d^2(k;t) = [r_i(k;t)^2] - \mu_d^2(k;t), \tag{2b}$$

where the averages over the "ensemble" of stocks for a given bin in a given day are
expressed with square brackets: $[\dots]$. We note that $\mu_d(k;t)$ may be interpreted as the
"return of an index", equiweighted on all stocks. We will be more interested in the
average of $\sigma_d^2(k;t)$ over all days, as a way to characterise the typical intraday
evolution of the "dispersion" between stock returns. Detailed studies of this dispersion
and other such measures, concerning both stock prices and returns, will be presented
elsewhere [8].
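
As an illustration of Eqs. (2a)-(2b), the following Python sketch computes the cross-sectional moments for one bin and then averages the squared dispersion over days. The function names and the nested-list layout `returns[t][k]` are assumptions of this sketch, not part of the original study.

```python
import math


def cross_sectional_moments(returns_kt):
    """Return (mu_d, sigma_d) for one bin k of one day t, as in Eqs. (2a)-(2b).

    returns_kt: list with the N stock returns r_i(k;t).
    """
    n = len(returns_kt)
    mu_d = sum(returns_kt) / n
    var_d = sum(r * r for r in returns_kt) / n - mu_d ** 2
    # Guard against tiny negative values caused by rounding.
    return mu_d, math.sqrt(max(var_d, 0.0))


def average_dispersion(returns):
    """Average of sigma_d^2(k;t) over days, one value per bin k.

    returns[t][k] is the list of the N stock returns on day t in bin k.
    """
    n_days = len(returns)
    n_bins = len(returns[0])
    out = []
    for k in range(n_bins):
        total = 0.0
        for t in range(n_days):
            _, sigma_d = cross_sectional_moments(returns[t][k])
            total += sigma_d ** 2
        out.append(total / n_days)
    return out
```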
Although the dispersion described above indicates the "co-movements" of
stocks, a more common and direct characterisation is through the standard
"correlation" of returns. In order to measure the correlation matrix of the returns, each
return is normalised by the dispersion of the corresponding bin, to reduce the
intraday seasonality and also to take into account the fluctuation of the volatility in the
considered time period $T$. Therefore, following the same prescription as in Ref. [1],
we define $\hat{r}_i(k;t) = r_i(k;t) / \sigma_d(k;t)$ and study the correlation matrix defined for a
given bin $k$:

$$\rho_{ij}(k) = \frac{\langle \hat{r}_i(k;t)\, \hat{r}_j(k;t) \rangle - \langle \hat{r}_i(k;t) \rangle \langle \hat{r}_j(k;t) \rangle}{\hat{\sigma}_i(k)\, \hat{\sigma}_j(k)}, \tag{3}$$

where $\hat{\sigma}_i(k)$ denotes the standard deviation over days of $\hat{r}_i(k;t)$.

The largest eigenvalue of the $N \times N$ correlation matrix $C(k)$ composed of the
elements $\rho_{ij}(k)$ is denoted by $\lambda_1(k)$ and is equal to the risk of the corresponding
eigenmode, the "market mode", with all entries positive and close to $1/\sqrt{N}$. In fact,
$\lambda_1(k)/N$ can be seen as a measure of the average correlation between stocks. We will
be interested in the intraday evolution, or bin-dependence, of the largest
eigenvalue².
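
The link between $\lambda_1/N$ and the average correlation can be checked numerically. The sketch below uses plain power iteration, which is a generic technique chosen here for illustration and not necessarily what was used in the study; `largest_eigenvalue` and `average_correlation` are hypothetical helper names. For a matrix whose off-diagonal entries all equal $\rho$, the largest eigenvalue is $1 + (N-1)\rho$, so $\lambda_1/N = \rho + (1-\rho)/N$, which approaches $\rho$ for large $N$.

```python
import math


def largest_eigenvalue(matrix, n_iter=1000, tol=1e-12):
    """Power iteration for the leading eigenpair of a symmetric,
    positive semi-definite matrix such as a correlation matrix."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n  # start from the uniform "market" direction
    lam = 0.0
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        # Rayleigh quotient v^T M v with the normalised vector
        new_lam = sum(
            v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)
        )
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return lam, v


def average_correlation(matrix):
    """Mean of the off-diagonal entries."""
    n = len(matrix)
    total = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))
```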



2.2 Correlation matrix with tick-by-tick data

Computing correlations from intraday data raises many issues concerning the
usual estimators, as already indicated above. Let us assume that we observe $T$ time
series of prices or log-prices $p_i$ ($i = 1, \dots, T$), observed at times $t_m$ ($m = 0, \dots, M$).
The usual estimator of the covariance of prices $i$ and $j$ is the realized covariance
estimator, which is computed as:

$$\hat{\Sigma}^{RV}_{ij} = \sum_{m=1}^{M} \left( p_i(t_m) - p_i(t_{m-1}) \right) \left( p_j(t_m) - p_j(t_{m-1}) \right).$$

The problem is that high-frequency tick-by-tick data record changes of prices
when they happen, i.e. at times that are neither predefined nor equidistant. Multivariate
tick-by-tick data are thus asynchronous, contrary to daily close prices, for
example, which are by construction synchronous for all the assets on a given exchange.
Using standard estimators without caution could be one cause of the "Epps effect",
first observed in [9], which stated that "correlations among price changes in
common stocks of companies in one industry are found to decrease with the length of the
interval for which the price changes are measured." Hence, here we also use the
Hayashi-Yoshida estimator [7], which takes (part of) the Epps effect into account. There
are many other estimators that may be used in general, and a comparison of such
estimators has been performed in Ref. [10].



² A similar study of the intraday evolution of the first eigenvector is of great interest and has
been performed as well in [1].

Hayashi-Yoshida (HY) estimator

In [7], the authors introduced a new estimator for the linear correlation coefficient
between two asynchronous diffusive processes. Given two Itô processes $X$, $Y$ such
that

$$dX_t = \mu_t^X\, dt + \sigma_t^X\, dW_t^X \tag{4}$$
$$dY_t = \mu_t^Y\, dt + \sigma_t^Y\, dW_t^Y \tag{5}$$
$$d\langle W^X, W^Y \rangle_t = \rho_t\, dt, \tag{6}$$

and observation times $0 = t_0 < t_1 < \dots < t_{n-1} < t_n = T$ for $X$, and
$0 = s_0 < s_1 < \dots < s_{m-1} < s_m = T$ for $Y$, which must be independent for $X$ and $Y$,
they showed that the following quantity:

$$\sum_{i,j} r_i^X\, r_j^Y\, \mathbb{1}_{\{O_{ij} \neq \emptyset\}}, \tag{7}$$
$$O_{ij} = \,]t_{i-1}, t_i] \cap ]s_{j-1}, s_j]$$
$$r_i^X = X_{t_i} - X_{t_{i-1}}$$
$$r_j^Y = Y_{s_j} - Y_{s_{j-1}},$$

is an unbiased and consistent estimator of $\int_0^T \sigma_t^X \sigma_t^Y \rho_t\, dt$, as the largest mesh size
goes to zero. In practice, it amounts to summing every product of increments as soon
as they share any overlap of time. In the case of constant volatilities and correlation,
it provides a consistent estimator for the correlation

$$\hat{\rho} = \frac{\sum_{i,j} r_i^X\, r_j^Y\, \mathbb{1}_{\{O_{ij} \neq \emptyset\}}}{\sqrt{\sum_i (r_i^X)^2\, \sum_j (r_j^Y)^2}}. \tag{8}$$
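
The "sum over overlapping increments" rule can be made concrete with a short Python sketch. It is a direct, quadratic-time transcription of the overlap condition and is meant only as an illustration; the function name `hayashi_yoshida` and its argument layout are assumptions, and a production implementation would exploit the time ordering to avoid the double loop.

```python
import math


def hayashi_yoshida(times_x, x, times_y, y):
    """HY covariance and correlation of two asynchronously observed series.

    times_x, x: strictly increasing observation times and (log-)prices of X;
    times_y, y: the same for Y. Returns (covariance, correlation).
    """
    # Each increment is stored with the interval ]start, end] it spans.
    rx = [(times_x[i - 1], times_x[i], x[i] - x[i - 1]) for i in range(1, len(x))]
    ry = [(times_y[j - 1], times_y[j], y[j] - y[j - 1]) for j in range(1, len(y))]
    cov = 0.0
    for a0, a1, dx in rx:
        for b0, b1, dy in ry:
            # ]a0, a1] and ]b0, b1] intersect iff max(a0, b0) < min(a1, b1)
            if max(a0, b0) < min(a1, b1):
                cov += dx * dy
    var_x = sum(dx * dx for _, _, dx in rx)
    var_y = sum(dy * dy for _, _, dy in ry)
    if var_x > 0 and var_y > 0:
        corr = cov / math.sqrt(var_x * var_y)
    else:
        corr = float("nan")
    return cov, corr
```

When both series are observed at the same times, only matching intervals overlap and the estimator reduces to the realized covariance. Note that in finite samples the resulting correlation is not guaranteed to lie in [-1, 1].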



2.3 Pearson correlation coefficient and correlation matrix with 
daily returns 

In order to study the equal-time cross-correlations between stocks, we first denote
the closure price of stock $i$ on day $\tau$ by $P_i(\tau)$, and determine the logarithmic return of
stock $i$ as $r_i(\tau) = \ln P_i(\tau) - \ln P_i(\tau - 1)$. For the sequence of $T$ consecutive trading
days, encompassing a given window $t$ with width $T$, these returns form the return
vector $\boldsymbol{r}_i^t$. In order to characterize the synchronous time evolution of assets, we use
the equal-time Pearson correlation coefficients between assets $i$ and $j$, defined as

$$\rho_{ij}^t = \frac{\langle r_i^t r_j^t \rangle - \langle r_i^t \rangle \langle r_j^t \rangle}{\sqrt{\left[ \langle (r_i^t)^2 \rangle - \langle r_i^t \rangle^2 \right] \left[ \langle (r_j^t)^2 \rangle - \langle r_j^t \rangle^2 \right]}}, \tag{9}$$

where $\langle \dots \rangle$ indicates a time average over the $T$ consecutive trading days included
in the return vectors. These correlation coefficients fulfill the usual condition
$-1 \le \rho_{ij} \le 1$ and form an $N \times N$ correlation matrix $C$, which serves as the basis of
further analyses [11, 12].

For analysis, the data is divided time-wise into $M$ windows ($t = 1, 2, \dots, M$) of
width $T$, corresponding to the number of daily returns included in the window.
Consecutive windows may or may not overlap, to an extent dictated by the window
step length parameter $\delta t$, which describes the displacement of the window, also
measured in trading days. The window width $T$ and the window step $\delta t$ must be
chosen carefully: for example, $T$ must be long enough to capture a signal with
sufficient statistical power, but must not cover so long a period that the signal
could have varied over it.
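
The windowed correlation analysis can be sketched in a few lines of Python. The helper names (`log_returns`, `pearson`, `windowed_correlations`) and the list-of-lists data layout are hypothetical; `width` and `step` play the roles of $T$ and $\delta t$. The sketch assumes that no return series is constant within a window, since the correlation is then undefined.

```python
import math


def log_returns(prices):
    """Daily logarithmic returns ln P(tau) - ln P(tau - 1)."""
    return [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]


def pearson(a, b):
    """Equal-time Pearson correlation of two equally long return series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum(x * y for x, y in zip(a, b)) / n - mean_a * mean_b
    var_a = sum(x * x for x in a) / n - mean_a ** 2
    var_b = sum(y * y for y in b) / n - mean_b ** 2
    return cov / math.sqrt(var_a * var_b)


def windowed_correlations(returns, width, step):
    """One N x N correlation matrix per window.

    returns[i] is the return series of stock i; `width` is the window
    width and `step` the displacement between consecutive windows
    (step == width gives non-overlapping windows).
    """
    n = len(returns)
    length = len(returns[0])
    matrices = []
    for start in range(0, length - width + 1, step):
        win = [series[start:start + width] for series in returns]
        matrices.append(
            [[1.0 if i == j else pearson(win[i], win[j]) for j in range(n)]
             for i in range(n)]
        )
    return matrices
```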



2.4 Distance matrix 

To obtain "distances", a non-linear transformation

$$d_{ij} = \sqrt{2\,(1 - \rho_{ij}^t)} \tag{10}$$

is used, with the property $2 \ge d_{ij} \ge 0$, forming an $N \times N$ distance matrix $D^t$, such
that all distances are "ultrametric". The concept of ultrametricity is discussed in
detail by Mantegna [13]. Out of the several possible ultrametric spaces, the
subdominant ultrametric is opted for due to its simplicity and remarkable properties.
The choice of the non-linear function is again arbitrary, as long as all the conditions
of ultrametricity are met.
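
A minimal Python sketch of the correlation-to-distance transformation, assuming the commonly used form $d_{ij} = \sqrt{2(1 - \rho_{ij})}$, which satisfies the stated bounds $0 \le d_{ij} \le 2$. The function name is hypothetical.

```python
import math


def correlation_to_distance(corr):
    """Map a correlation matrix to distances d_ij = sqrt(2 * (1 - rho_ij)).

    rho = 1 gives d = 0, rho = 0 gives d = sqrt(2), rho = -1 gives d = 2.
    max(0.0, ...) protects against rho slightly above 1 due to rounding.
    """
    return [[math.sqrt(max(0.0, 2.0 * (1.0 - c))) for c in row] for row in corr]
```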



2.5 Multidimensional scaling (MDS) 

Multidimensional scaling is a set of data analysis techniques that display the
structure of "distance"-like data as a "geometrical picture", where each object is
represented by a point in a multidimensional space. The points are arranged in this space
such that the distances between pairs of points have the strongest possible relation to
the "similarities" among the pairs of objects: two similar objects are represented
by two points that are close together, and two dissimilar objects are represented
by two points that are far apart. The space is usually a two- or three-dimensional
Euclidean space, but may be non-Euclidean and may have more dimensions.

MDS is a generic term that includes many different types, classified according
to whether the similarities data are "qualitative" (called non-metric MDS) or
"quantitative" (metric MDS). The number of similarity matrices and the nature of the
MDS model can also classify MDS types. This classification yields classical MDS
(one matrix, unweighted model), replicated MDS (several matrices, unweighted
model), and weighted MDS (several matrices, weighted model). For a general
introduction and overview, please see Ref. [14].

The collection of objects to be analyzed in our case is the $N$ stocks, on which a
distance function is defined using Eq. (10). These distances are the entries of the
similarity matrix

$$D^t = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1N} \\ d_{21} & d_{22} & \cdots & d_{2N} \\ \vdots & \vdots & & \vdots \\ d_{N1} & d_{N2} & \cdots & d_{NN} \end{pmatrix}. \tag{11}$$

Given $D^t$, the aim of MDS is to find $N$ vectors $x_1, \dots, x_N \in \mathbb{R}^D$, such that

$$\| x_i - x_j \| \approx d_{ij} \quad \forall\, i, j \in \{1, \dots, N\}, \tag{12}$$

where $\| \cdot \|$ is a vector norm. In classical MDS, this norm is typically the Euclidean
distance metric.

In other words, MDS tries to find a mathematical embedding of the $N$ objects
into $\mathbb{R}^D$ such that distances are preserved. If the dimension $D$ is chosen to be 2 or 3,
we are able to plot the vectors $x_i$ to obtain a visualization of the similarities between
the $N$ objects. It may be noted that the vectors $x_i$ are not unique: with the Euclidean
metric, they may be arbitrarily translated and rotated, since these transformations
do not change the pairwise distances $\| x_i - x_j \|$.

There are various approaches to determining the vectors $x_i$. Generally, MDS is
formulated as an optimization problem, where $(x_1, \dots, x_N)$ is found as the minimizer
of some cost function, such as

$$\min_{x_1, \dots, x_N} \sum_{i<j} \left( \| x_i - x_j \| - d_{ij} \right)^2. \tag{13}$$

A solution may then be found by numerical optimization techniques. In our case,
we used simulated annealing as the optimization procedure.
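
To make the optimization concrete, here is a small, self-contained Python sketch of metric MDS by simulated annealing on a sum-of-squares cost of the kind given above. It is not the MATLAB code used for the results in this paper: the function names, the single-point Gaussian move, the geometric cooling schedule and all parameter values are assumptions chosen only for illustration. The optional `init` argument allows a warm start from a previous map.

```python
import math
import random


def stress(coords, dist):
    """Sum over pairs i < j of (||x_i - x_j|| - d_ij)^2."""
    n = len(coords)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(coords[i], coords[j]) - dist[i][j]) ** 2
    return total


def mds_anneal(dist, dim=2, n_steps=20000, t0=1.0, cooling=0.9995,
               step=0.1, seed=0, init=None):
    """Minimise the stress by simulated annealing; returns (coords, stress)."""
    rng = random.Random(seed)
    n = len(dist)
    if init:
        coords = [list(p) for p in init]  # warm start, e.g. previous map
    else:
        coords = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
    cur = stress(coords, dist)
    best, best_coords = cur, [p[:] for p in coords]
    temp = t0
    for _ in range(n_steps):
        i = rng.randrange(n)  # perturb one randomly chosen point
        old = coords[i][:]
        coords[i] = [x + rng.gauss(0.0, step) for x in old]
        new = stress(coords, dist)
        # Metropolis rule: always accept improvements, sometimes accept worse
        if new <= cur or rng.random() < math.exp((cur - new) / temp):
            cur = new
            if cur < best:
                best, best_coords = cur, [p[:] for p in coords]
        else:
            coords[i] = old  # reject the move
        temp *= cooling
    return best_coords, best
```

Recomputing the full stress at every step keeps the sketch short; for larger $N$ one would update only the terms involving the moved point.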



3 Results 



3.1 U-effect in volatility 

In financial studies, among the periodicities or "seasonalities" is the "U-effect"
[15, 16], which describes the intraday pattern of the average volatility $\sigma(k) = [\sigma_i(k)]$ of
individual stocks: the average volatility is high during the market opening hours,
then decreases to reach a minimum around lunch time, and increases again
steadily until the market closes. We show a similar result in Fig. 1, computed with
the CAC40 intraday data for March 2011. The average of $|\mu_d(k;t)|$ is a
proxy for the "index volatility", and is also displayed in Fig. 1: it shows a U-shaped
pattern similar to that of $\sigma(k)$.

Fig. 1 Plots of the average volatility of stocks $\sigma(k)$, the average cross-sectional dispersion $\sigma_d(k)$
and the average absolute value of the index return $\langle |\mu_d(k;t)| \rangle$ as a function of the 5-minute bins
denoted by $k$, from 10h00-16h00 CET, for March 2011. Courtesy: E. Guevara H. et al. [8]



3.2 The eigenvalues of the correlation matrix and average 
correlations 

The largest eigenvalue $\lambda_1$ of the correlation matrix of stock returns is well known
to be associated with the "market mode", i.e. all stocks moving more or less in a
synchronized manner. We show in the top panel of Fig. 2 the magnitude of $\lambda_1/N$
computed from Eq. (3) on 5-min data, as a function of the bin $k$. Interestingly, the
average correlation clearly increases as time elapses. As mentioned earlier, the
quantity $\lambda_1/N$ captures the behavior of the average correlation between stocks, which
can be seen in the bottom panel of Fig. 2.

Fig. 2 Top: Top eigenvalues of the correlation matrix, $\lambda_i(k)/N$, $i = 1, \dots, 7$, as a function of the
5-minute bins denoted by $k$, from 10h00-16h00 CET, in March 2011. [5-min sampled prices,
courtesy E. Guevara H. et al. [8]]. Bottom: The largest eigenvalue $\lambda_1/N$ (circles) is a proxy for
the average correlation (plain) [HY correlations for every pair and every bin of every day, then
averaged over days for visual comfort and comparison with previous figure].

The evolution of the next six eigenvalues $\lambda_i(k)$, $i = 2, \dots, 7$, is also shown in
Fig. 2. We see that the amplitudes of these decrease with time. It may be
appropriate to quote the authors of Ref. [1]: "Although by construction the trace of the
correlation matrix, and therefore the sum of all N eigenvalues is constant (and equal
to N), this decrease is not a trivial consequence of the increase of $\lambda_1$ ... What we see
here is that as the day proceeds, more and more risk is carried by the market factor,
while the amplitude of sectorial moves shrivels in relative terms (but remember that
the correlation matrix is defined after normalising the returns by the local volatility,
which increases in the last hours of the day)."

We also compute, using Eq. (8), the cross-correlation matrices with tick-by-tick
data, for all 72 bins per day and 20 days in a month. The temporal evolution of the
pairwise average correlation coefficients as a function of bins, for different days,
and further averaged over all the days, is plotted in Fig. 3.

Fig. 3 Plot of the (pairwise) average correlations as functions of bins $k$, for different days. Thick
solid line: Plot of the average correlation coefficients, further averaged over all the days, which
shows that the average correlation between stocks increases throughout the day. Thick dashed
lines: Plots of the standard deviations on either side of the average correlation.



3.3 MDS using intraday data 

In order to visually capture the co-movement of stocks, we used MDS plots
of the 40 stocks of the CAC40 index (see the list of CAC40 stocks in Table 1), for
the period of March 2011. We used 30-minute bins to compute the correlations,
using the Hayashi-Yoshida estimator. We used the period 10h00-16h00 CET, so as
to get 12 bins per day for the 22 days. Using the correlation matrices as input, we
made the distance transformations (using Eq. (10)) to produce the distance matrices.
These distance matrices were then used as inputs to the standard MDS function in
MATLAB. We used the method of simulated annealing to optimize the cost function
of a particular bin. The first bin starts with an initial set of coordinates chosen at
random; for the following bins, we used the final results of the previous bins as the
initial states³. The output of the MDS were the coordinates, which were plotted as
the MDS maps. The coordinates were plotted in a manner such that the centroid of
the map coincided with the origin (0,0). We then computed the mean distance of all
the coordinates from the centre, and plotted this measure as a function of time.
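
The centring step and the "mean distance from the centre" measure are simple to state in code. The following Python sketch is illustrative only; `center_and_mean_radius` is a hypothetical name.

```python
import math


def center_and_mean_radius(coords):
    """Shift coordinates so that their centroid is the origin and return
    (centered_coords, mean Euclidean distance of the points from the origin)."""
    n = len(coords)
    dim = len(coords[0])
    centroid = [sum(p[d] for p in coords) / n for d in range(dim)]
    centered = [[p[d] - centroid[d] for d in range(dim)] for p in coords]
    mean_radius = sum(math.sqrt(sum(c * c for c in p)) for p in centered) / n
    return centered, mean_radius
```

For example, the four corners of a square of side 2 are moved to (±1, ±1) and have a mean distance of $\sqrt{2}$ from the centre.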

During the course of any day, since the correlation matrix changes with every bin,
the MDS map also changes. Just as it is interesting to study how the average
correlation between the stocks varies during the day, we thought it would also be
interesting to study how the MDS map evolves "on average" during the day. We
had two choices: (i) run the MDS algorithm for every bin of each of the 22 days, take the
average of the coordinates over all 22 maps, and plot this map for every bin; (ii)
take the average of the correlations over the 22 days for each bin, and plot a single
MDS map for every bin. We executed both, to see the variations. In choice (i), for
every bin $k$ we take an average of the coordinates generated by the 22 MDS runs (for
different days) and plot them stock by stock. Some stocks fluctuate a lot from day
to day in the same time bin; others fluctuate less. On the whole we expected
to see the average structure (clustering) of the market. In choice (ii), we expected to
see less structure, since when we take the average of correlations over all 22 days,
and then run the MDS once for every bin, the variances in the correlations disappear
and so the MDS plots look more uniform.

3.3.1 Averaged (over days) coordinates in different bins 

We took the average of the coordinates (output of the MDS) of each company over
all 22 days, for a particular bin. We then plotted the MDS maps using these averaged
coordinates for the different bins to see the evolution during the day, as shown in
Fig. 4 (for the first six bins) and Fig. 5 (for the last six bins). We find that there is some
structure, and particular companies always stay together in a cluster or a group.

³ This is to avoid too drastic a change in the MDS plots from one bin to another, keeping in mind
that the vectors $x_i$ are not unique: with the Euclidean metric, they may be arbitrarily translated
and rotated.

Fig. 4 MDS plots for bins 1-6. Each point on a plot represents a stock (see the list of CAC40 stocks in
Table 1), designated by two coordinates $(x_i, y_i)$, $i = 1, \dots, N$. We took the average of the coordinates
(output of the MDS) of each company over all 22 days, for a particular bin. We then plotted the
MDS maps using these averaged coordinates for the different bins to see the evolution during the
day.

Fig. 5 MDS plots for bins 7-12. Each point on a plot represents a stock (see the list of CAC40 stocks in
Table 1), designated by two coordinates $(x_i, y_i)$, $i = 1, \dots, N$. We took the average of the coordinates
(output of the MDS) of each company over all 22 days, for a particular bin. We then plotted the
MDS maps using these averaged coordinates for the different bins to see the evolution during the
day.



3.3.2 Averaged (over days) correlations in different bins 

We also took the average of the correlation coefficients for each pair over all 22
days, and then used them to generate the MDS plot for a particular bin. We then
plotted the MDS maps for the different bins to see the evolution during the day, as
shown in Fig. 6 (for the first six bins) and Fig. 7 (for the last six bins). We find that there is
less structure than in the previous plots (as averaging the correlations "smooths out" the
dissimilarities). The structures of the maps and the positions of the companies do not
change drastically during the course of the day.

Fig. 6 MDS plots for bins 1-6. Each point on a plot represents a stock (see the list of CAC40 stocks in
Table 1), designated by two coordinates $(x_i, y_i)$, $i = 1, \dots, N$. We took the average of the correlation
coefficients for each pair over all 22 days, and then used them to generate the MDS plot for a
particular bin. We then plotted the MDS maps for the different bins to see the evolution during the
day.

Fig. 7 MDS plots for bins 7-12. Each point on a plot represents a stock (see the list of CAC40 stocks in
Table 1), designated by two coordinates $(x_i, y_i)$, $i = 1, \dots, N$. We took the average of the correlation
coefficients for each pair over all 22 days, and then used them to generate the MDS plot for a
particular bin. We then plotted the MDS maps for the different bins to see the evolution during the
day.



We further plotted the variation of the mean distance of all the coordinates from
the centre of the map over the different bins, to see the temporal evolution during
the day, in Fig. 8. This follows exactly the opposite trend to the average correlations
shown in Fig. 2 or Fig. 3: the mean distance decreases during the day. This result
is as expected, since the distance is a decreasing function of the correlation.

Fig. 8 Mean distance of coordinates of all the points (40 stocks) from the center of the map, as a
function of the bin $k$. There are 12 bins of 30 minutes between 10:00 and 16:00 CET.



Table 1 RICS list of the stocks in the CAC 40.

Names                              RICS
ACCOR FICTIVE                      ACCP.PA
AIR LIQUIDE                        AIRP.PA
ALCATEL LUCENT                     ALUA.PA
ALSTOM                             ALSO.PA
ARCELOR MITTAL FICTIVE             ISPA.AS
AXA                                AXAF.PA
BNP PARIBAS                        BNPP.PA
BOUYGUES                           BOUY.PA
CAP GEMINI                         CAPP.PA
PERNOD RICARD                      PERP.PA
VALLOUREC                          VLLP.PA
CARREFOUR                          CARR.PA
PEUGEOT SA                         PEUP.PA
VEOLIA ENVIRONNEMENT               VIE.PA
CREDIT AGRICOLE SA                 CAGR.PA
PPR                                PRTP.PA
VINCI                              SGEF.PA
DANONE                             DANO.PA
PUBLICIS                           PUBP.PA
VIVENDI                            VIV.PA
EADS PEA FICTIVE                   EAD.PA
RENAULT                            RENA.PA
EDF                                EDF.PA
SAINT GOBAIN                       SGOB.PA
ESSILOR INTERNATIONAL              ESSI.PA
SANOFI                             SASY.PA
FRANCE TELECOM                     FTE.PA
SCHNEIDER ELECTRIC SA              SCHN.PA
GDF SUEZ                           GSZ.PA
SOCIETE GENERALE                   SOGN.PA
LOREAL                             OREP.PA
STMICROELECTRONICS PEA FICTIVE     STM.PA
LVMH                               LVMH.PA
SUEZ ENVIRONNEMENT SA              SEVI.PA
LAFARGE                            LAFP.PA
TECHNIP                            TECF.PA
MICHELIN                           MICP.PA
TOTAL                              TOTF.PA
NATIXIS                            CNAT.PA
UNIBAIL-RODAMCO SE                 UNBP.PA

3.4 MDS using daily data

In order to capture the co-movement of stocks visually, we again used MDS plots
of 54 stocks from Yahoo daily data, for the period January 2008-May 2011. We
computed the correlations using non-overlapping windows of $T$ consecutive trading
days, using Eq. (9). The choice of $T$ is important because if $T/N$ is small, then
according to Random Matrix Theory we cannot distinguish between noise and
the true signal. Since MDS needs a full-rank correlation matrix, the noise needs to
be cleaned with appropriate statistical measures before applying MDS.
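
One standard way to quantify the role of $T/N$ is the Marchenko-Pastur result of Random Matrix Theory: for $N$ uncorrelated, unit-variance series observed over $T \ge N$ periods, the eigenvalues of the sample correlation matrix fall, in the limit of large $N$ and $T$ at fixed ratio, in the band $\lambda_\pm = (1 \pm \sqrt{N/T})^2$. The Python sketch below evaluates these edges; the function name is hypothetical, and using this band as a noise benchmark is an illustration added here rather than a procedure described in the text.

```python
import math


def marchenko_pastur_edges(n_stocks, n_obs):
    """Edges (lambda_minus, lambda_plus) of the Marchenko-Pastur band for the
    correlation matrix of n_stocks uncorrelated series with n_obs
    observations each (asymptotic result, assuming n_obs >= n_stocks)."""
    root = math.sqrt(n_stocks / n_obs)
    return (1.0 - root) ** 2, (1.0 + root) ** 2
```

For N = 54 and T = 54 the band is [0, 4], so under this benchmark only eigenvalues above 4 stand out from pure noise; for N = 18 and T = 30 the upper edge is about 3.15.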

As before, using the correlation matrices as input, we made the distance
transformations (using Eq. (10)) to produce the distance matrices. These distance matrices
were then used as inputs to the MDS code in MATLAB. We used the method of
simulated annealing to optimize the cost function of a particular day. The first day
(time-step) starts with an initial set of coordinates chosen at random; for the
following days (time-steps), we used the final results of the previous day (time-step) as the
initial state⁴. The output of the MDS were the coordinates, which were plotted as
the MDS maps. The coordinates were plotted in a manner such that the centroid of
the map coincided with the origin (0,0). We then computed the mean distance of all
the coordinates from the centre, and plotted this measure as a function of time.

In Fig. 9 we plot MDS maps for sample dates: 28/05/2008 (pre-Subprime crisis),
27/10/2008 (onset of Subprime crisis) and 28/06/2010 (post-Subprime crisis).
In these plots we do see differences in the positions of the companies. The
position of Lehman Brothers in the MDS plot for the post-Subprime crisis period is
noteworthy.

We also plot in Fig. 9 the mean distance of coordinates from the center for the period
01/01/2008 to 31/12/2009. There is certainly a noticeable variation over this entire
period, and the period of the Subprime crisis can be identified with the low value of the
mean distance.



⁴ This is to avoid too drastic a change in the MDS plots from one time step to another, keeping
in mind that the vectors $x_i$ are not unique: with the Euclidean metric, they may be arbitrarily
translated and rotated. We imposed a small penalty in the cost function for deviation from the
initial state.

Fig. 9 The correlation matrices are computed from Yahoo daily closure price data using Eq. (9) and a
54 trading day window, for the set of 54 companies. The points on each MDS plot represent stocks,
each designated by two coordinates $(x_i, y_i)$, $i = 1, \dots, 54$. Top-most: MDS plot for date 28/05/2008.
Top: MDS plot for date 27/10/2008. Bottom: MDS plot for date 28/06/2010. Bottom-most: Mean
distance of coordinates from center for the two year period 1/01/2008 to 31/12/2009.

In order to examine carefully whether any clusters can be identified, we worked
with a subset of 18 companies. In Fig. 10 and Fig. 11 we plot MDS maps for
different sample dates: 03/06/2008, 25/07/2008, and 05/09/2008 (pre-Subprime crisis);
17/10/2008, 28/11/2008 and 13/01/2009 (during Subprime crisis); 24/02/2009,
07/04/2009, 12/09/2009 and 04/11/2009 (post-Subprime crisis).



Fig. 10 MDS plots for different dates. Top Left: 03/06/2008 Top Right: 25/07/2008 Middle Left:
05/09/2008 Middle Right: 17/10/2008 Bottom Left: 28/11/2008 Bottom Right: 13/01/2009. The
correlation matrices are computed from Yahoo daily closure price data using Eq. (9) and a 30
trading day window, for the subset of 18 companies. The points on each plot represent stocks, each
designated by two coordinates $(x_i, y_i)$, $i = 1, \dots, 18$.

Fig. 11 MDS plots for different dates. Top Left: 24/02/2009 Top Right: 07/04/2009 Bottom Left:
12/09/2009 Bottom Right: 04/11/2009. The correlation matrices are computed from Yahoo daily
closure price data using Eq. (9) and a 30 trading day window, for the subset of 18 companies. The
points on each plot represent stocks, each designated by two coordinates $(x_i, y_i)$, $i = 1, \dots, 18$.



In these plots we see considerable differences in the positions of the
companies. However, it is interesting to follow the positions of certain pairs:

(i) JP Morgan and Bank of America

(ii) Nissan and Toyota

(iii) Chevron and Exxon

(iv) Pepsi and Coca Cola.

This type of visual plot may therefore be used in identifying potential pairs of stocks
for a "pairs trade". Such a strategy monitors the performances of two historically
correlated stocks: when the correlation between the two securities temporarily weakens,
i.e. one stock moves up while the other moves down, the pairs trade strategy would
be to short the outperforming stock and go long the underperforming one, betting
that the "spread" between the two would eventually converge. Further analysis is of
course necessary to devise such a strategy.
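
As a purely illustrative sketch of the "spread" logic described above, the following Python code computes a rolling z-score of the log-price spread of a pair and maps it to a position. None of this is taken from the analysis in this paper: the z-score formulation, the trailing window, the entry threshold of 2 and the function names are all assumptions, and transaction costs and exit rules are ignored.

```python
import math


def spread_zscores(log_p_a, log_p_b, window):
    """Z-score of the spread log_p_a - log_p_b relative to the mean and
    standard deviation of the preceding `window` observations."""
    spread = [a - b for a, b in zip(log_p_a, log_p_b)]
    z = []
    for t in range(window, len(spread)):
        hist = spread[t - window:t]
        mean = sum(hist) / window
        var = sum((s - mean) ** 2 for s in hist) / window
        z.append((spread[t] - mean) / math.sqrt(var) if var > 0 else 0.0)
    return z


def pairs_signal(z, entry=2.0):
    """-1: short A / long B (A has outperformed); +1: long A / short B
    (A has underperformed); 0: no position."""
    if z > entry:
        return -1
    if z < -entry:
        return 1
    return 0
```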

We also find that there is some noticeable clustering effect: for example, all the
European banks are in one cluster and all the European automobile companies are in
another cluster.

In this paper, we first reviewed existing results on intraday patterns concerning both
individual and collective stock dynamics. We studied the cross-sectional
"dispersion" of returns and its typical evolution during the day, and found that the average
volatility is high during the market opening hours, then decreases to reach a
minimum around lunch time, and increases again steadily until the market closes.
The average of $|\mu_d(k;t)|$, which is a proxy for the "index volatility", also displayed
a U-shaped pattern similar to that of $\sigma(k)$. Studying the intraday pattern of the
leading modes (eigenvalues) evaluated using the cross-correlation matrix between stock
returns, we found that the maximum eigenvalue $\lambda_1(k)$ (corresponding to the market
mode or average correlation) clearly increases as time elapses. However, the
evolution of the next six eigenvalues $\lambda_i(k)$, $i = 2, \dots, 7$, showed that the amplitudes of
these decrease with time. Then, we made additional plots of the pair-wise
cross-correlation matrix elements and studied their typical evolution during the day.
Finally, we used multidimensional scaling (MDS) to generate maps and visualize
the dynamic evolution of the stock market during the day. When the MDS studies
were repeated with daily data, we found that it was easier to visualize or detect
specific sectors, strongly correlated pairs and market events. We suggest that this
type of plot using daily data may be used in designing "pairs trade" strategies as
explained earlier, or in identifying clusters or detecting market trends.